Abstract

Contributing to a rising number of Critical Data Studies which seek to understand and critically reflect on the increasing 
datafication and digitalisation of governance, this paper focuses on the field of school monitoring, in particular on digital 
data infrastructures, flows and practices in state education agencies. Our goal is to examine selected features of the 
enactment of datafication and, hence, to open up what has widely remained a black box for most education researchers. 
Our findings are based on interviews conducted in three state education agencies in two different national contexts (the 
US and Germany), thus addressing the question of how the datafication and digitalisation of school governance has not 
only manifested within but also across educational contexts and systems. As our findings illustrate, the implementation of 
data-based school monitoring and leadership in state education agencies appears as a complex entanglement of very 
different logics, practices and problems, producing both new capabilities and powers. Nonetheless, by identifying differ- 
ent types of ‘doing data discrepancies’ reported by our interviewees, we suggest an analytical heuristic to better 
understand at least some features of the multifaceted enactment of data-based, increasingly digitalised governance,
within and beyond the field of education.
Introduction


This paper seeks to contribute to the fast-growing body 
of Critical Data Studies by providing empirical insights 
into the pursuit of data, measurement and commen- 
suration in the field of public education. As in many 
other governmental spheres (for a recent overview see 
Smith, 2018: 3), the growing development of digital 
data infrastructures in education raises numerous ques- 
tions ‘[...] about the nature of data, how they are being 
produced, organized, analyzed and employed, and how 
best to make sense of them and the work they do. 
Critical data studies endeavours to answer such ques- 
tions’ (Kitchin and Lauriault, 2014: 1; see also Iliadis 
and Russo, 2016), while explicitly challenging the idea 
of data as being neutral or simply technical. 

In fact, there is a visibly growing body of work that 
describes the expanding datafication and digitalisation
of education policy and practice, enhanced by the promotion
of so-called evidence-based governance (e.g.
Bellmann, 2015; Grek and Ozga, 2010), including 
research that has explicitly focused on the production 
and processing of international student assessments 
(Bloem, 2016; Gorur, 2014; Lewis, 2017; Villani, 
2018). Nonetheless, the increasingly digital and auto- 
mated formation, recoding, storage, manipulation and 
distribution of data, all of which have become integral 
features of education governance (Hartong, 2016, 
2018a; Landri, 2018; Sellar, 2015; Selwyn, 2014: 1;
Williamson, 2017), have not yet been extensively exam- 
ined (see West, 2017 for an important exception), rep- 
resenting a ‘black box’ for most education researchers 
and practitioners. In other words, as described by 
Selwyn (2014: 13-14), there remains a pressing need 
to better understand ‘[...] how various forms of digital 
data are [specifically] set to work within educational 
contexts, including what data is used, what the uses 
and consequences are, and how data has become 
embedded within different organisational cultures’. 

With this paper, we seek to respond to this need by 
examining selected features of expanding data infrastruc- 
tures, flows and practices of school monitoring in three 
state¹ education agencies in two different national contexts,
the Massachusetts Department of Elementary and
Secondary Education (DESE) in the US and the
Hamburg Department for Schools and Vocational 
Education (BSB) as well as the Institute for Educational 
Monitoring and Quality Improvement (IfBQ), an institu- 
tion attached to the BSB, in Germany. In both (federally 
organised) countries, the past two decades have witnessed 
either a tremendous turn towards (Germany) or a signifi- 
cant expansion (US) of datafication in education, pro- 
moted by a strong national coalition for evidence-based 
policy, which resulted in an extensive implementation and 
transformation of data infrastructures and flows
(Anagnostopoulos et al., 2013; Hartong, 2016, 2018a). 
Simultaneously, state education agencies in both coun- 
tries have been urged to produce growing amounts of 
data and to use that data for more effective and efficient 
school leadership and monitoring, particularly but not 
limited to holding schools accountable for digital data 
production (Gonzalez-Sancho and Vincent-Lancrin,
2016; Piattoeva, 2016).² As a result, state
education agencies have increasingly focused on and 
restructured themselves around data production, analysis, 
management and reporting, thus illustrating what Smith 
(2018: 3) recently described as the ‘dataism’ paradigm 
reshaping the everyday business of multiple actors and 
agencies.³

While the main goal of our study is to unpack for 
school monitoring what Kitchin and Lauriault (2014) 
describe as ‘data assemblages at work’, we also seek to 
contribute to a growing number of studies that focus on 
how the datafication and digitalisation of educational 
governance has manifested across educational contexts 
and systems. Notwithstanding a clear globalness in 
terms of ongoing transformations and thus broad com- 
monalities between datafication policies in various 
countries (e.g. Lingard et al., 2015; Williamson et al., 
2018), such examinations have also identified the 
significant influence of local contexts — including 
cultural, social or institutional settings — resulting in a 
significantly different ‘re/territorialisation’ of data
infrastructures, flows and practices (Hartong, 2018a).
Two of many examples are Schildkamp and Teddlie’s 
(2008) analysis of School Performance Feedback 
Systems in the US and the Netherlands, and a com- 
parative study on educational data production, avail- 
ability and use in China, Russia and Brazil by Centeno 
et al. (2018). The study presented here complements 
such analyses of digital technologies sitting ‘alongside 
pre-existing cultures and structures of educational set- 
tings’ (Selwyn, 2013: 209), while simultaneously filling a 
gap by focusing on a key, yet widely under-researched 
actor in the digitalisation of education governance so 
far, namely state education agencies and their role as 
‘data hubs’ between global, national and local data 
infrastructures and flows. 

The following section is devoted to further, yet brief, 
conceptual and methodological explanations before we 
explore the results of the study, principally drawing on 
16 interviews with 20 state agency experts conducted in 
Hamburg and Massachusetts between December 2017 
and April 2018.⁴ Our particular emphasis lies in
documenting how the implementation of data-based
school monitoring and leadership appears less as a
purely technical procedure than as a complex
entanglement of very different (technical and social)
logics, practices and problems. Specifically, we identify 
different types of ‘doing data discrepancies’, which, as 
we discuss in our conclusion, illustrate and conceptual- 
ise typical challenges associated with the pursuit of 
data, measurement and commensuration across many 
other domains of governmental or state activity, thus 
also offering important implications for the wider field 
of critical data studies. 


Conceptual and methodological framing 


The goal of our study was to better understand how 
datafication, in particular the ‘doing’ of school 
monitoring and leadership, has become enacted in 
two state education agencies across two country con- 
texts. Since we have already discussed this conceptual 
framing extensively in former contributions (Hartong, 
2018a, 2019), this section is limited to a brief summary 
of the main concepts employed: 

A central theme of our analysis is that of data infra- 
structures (or data assemblages, as used by Kitchin, 
2014) to describe and explore the networks of objects 
and subjects assembled around the socio-technical 
de- and recontextualisation of data in education 
(Anagnostopoulos et al., 2013: 8; Hartong, 2018a: 
135-136; Sellar, 2015; Williamson, 2017), here in state 
education agencies. In other words, attention is drawn 
to ‘[...] the technological, political, social and economic 
apparatuses and elements that constitute [...] and 
frame [...] the generation, circulation and deployment 
of data’ (Iliadis and Russo, 2016: 2, citing Kitchin and
Lauriault, 2014), to trace how data enter or are selected 
into a particular form, related to each other and, 
affected by these forms and relations, become informa- 
tion or governing knowledge (Sellar, 2017; Thompson 
and Sellar, 2018: 4-5). Thus, following Williamson 
(2017: 38-39) as well as Kitchin (2014), we understand 
data infrastructures not as static, but as practically 
enacted and constantly transformed by contingent, 
relational and contextual discursive and material prac- 
tices. In the case of state school monitoring, such prac- 
tices particularly refer to the collection, processing, 
modelling, management, visualisation or reporting of 
data, but also to the ways in which data is either pro- 
blematised or taken for granted. 

As our empirical observations will show, a key 
mechanism within the doing of state school monitoring 
is the fabrication of commensuration, which is the 
transformation of different qualities into comparable, 
usually quantified metrics (Espeland and Stevens, 
1998). At the same time, however, commensuration 
requires enormous organisation, decision-making and 
weighting which, as many of our interviewees reported, 
can pose significant challenges (which are mostly exter- 
nally invisible). The problem of fabricating commen- 
suration equally concerns software/coding activities 
and the embedding of these kinds of activities into
wider institutional practices (e.g. school support, 
accountability or reporting), which, to a large extent, 
means linking numbers to norms, values, and politics, 
and vice versa — for example by deciding which targets 
schools are expected to meet and, consequently, when 
to intervene as a state. As Diesner (2015) has argued, 
small decisions can thus produce a big (governmental) 
impact. Consequently, aside from an earnest attempt 
to understand the enactment of data infrastructures in 
state education agencies, we aim to unpack (and typify) 
at least some of these underlying, often ambivalent and 
difficult decision-making processes, including their poli- 
tical implications (e.g. profiling, social sorting, control 
creep, etc., see also Kitchin and Lauriault, 2014). 

With our exploration of state education agencies in 
Massachusetts (US) and Hamburg (Germany), we 
selected contrasting and simultaneously similar cases, 
resulting from a complex entanglement of different con- 
textual dimensions (Sobe and Kowalczyk, 2018). On 
the one hand, the US and Germany contain federal, 
multi-level architectures, where state education autho- 
rities to a large extent decide on the implementation, 
transformation and use of education monitoring sys- 
tems. On the other hand, both countries stand in 
stark contrast in terms of using and relying on (quanti- 
fied) data for educational governance. While in the 
former we find a strong traditional belief in the value 
of testing, rankings and the expertise of private test
providers (Sacks, 1999), the latter has for a long time
placed its faith more strongly in teachers, exerting 
‘[...] weak control and evaluation of the processes 
and almost no external control of the outcomes of 
schooling’ (Hopmann, 2003: 472). Even though 
Germany underwent a tremendous turn towards data- 
based school governance at the beginning of the 21st 
century (thus still qualifying as a global ‘latecomer’), 
this scepticism towards standardised testing and 
public rankings is still largely visible. In contrast, at 
least for the last 40 years, educational governance in 
the US has been characterised by the ever-growing 
importance of data-based accountability (Schildkamp 
and Teddlie, 2008: 262), further intensified by the so- 
called No Child Left Behind resolution in 2001 
(Anagnostopoulos et al., 2013; West, 2017). While an 
examination of doing monitoring in state education 
agencies needs to take into account these wider cultural 
and political contexts that frame ‘national imaginaries’ 
(Sobe and Kowalczyk, 2018) of datafication, the same 
is true for the different state contexts within both fed- 
erally shaped countries. In other words, the range of 
differences between state education agencies’ datafica- 
tion processes within the US and Germany may appear 
even larger than country differences. Responding to this 
additional complexity, this paper focuses on two states 
that, in relation to their state peers, present themselves 
as very advanced and experienced in terms of data- 
based school monitoring and leadership: 
Massachusetts and Hamburg. Furthermore, both rep- 
resent relatively small states in which data collection 
and centralisation appear less problematic than in 
more territorially extensive states. 

Methodologically, the presented findings draw firstly 
on material collected through extensive online research 
and document analysis, including organisation charts, 
policy papers, documentation on the development 
and usage of data instruments, as well as online data 
dashboards. Building on this initial research, we further 
conducted 16 semi-structured interviews with 20 state 
agency experts, ranging between 60 and 90 minutes 
each. We focused on the most relevant institutions con- 
ducting state-level data work related to school monitor- 
ing and leadership — the DESE in Massachusetts and, for 
Hamburg, the BSB as well as the IfBQ. We talked to as 
many ‘data experts’ as access allowed, operating across 
the fields of data collection, validation, modelling, stor- 
age, processing and distribution. 

Having concluded transcription of the interviews, we 
completed multiple reviews of the collected material, 
using topical coding as well as conceptual framing out- 
lining (Rivera, 2018: 8). We first reviewed sources for the 
two cases separately, annotating the text using codes 
referring to (a) the data infrastructure and flows in edu- 
cational monitoring and (b) descriptions of specific data 
practices. We then combined the annotated text sections
from both cases in a new document, sorting, comparing
and typifying data infrastructures and practices
described across the two cases/three agencies. Despite a 
wide range of topics, complexities, entanglements and 
narratives covered in the extracted text, we also induc- 
tively found what we subsequently coded and further 
analysed as ‘doing data discrepancies’. We typified par- 
ticular dimensions of such discrepancies assigned to cer- 
tain text passages. Finally, we generated two visualised 
heuristics that facilitated further refinement of our find- 
ings, which we turn to in the next section. 


Doing data-based school monitoring in 
state education agencies: Insights from 
Massachusetts and Hamburg 


While it must be recognised that minor variation 
exists, the technical infrastructure/process of state 
school monitoring in Hamburg and Massachusetts 
can be broadly summarised as follows (Figure 1). 
Firstly, numerous data points are digitally collected 
from school and/or student information systems (in 
the US, with districts acting as data mediators) within 
varying time frames (from annually to daily). 
Submitted data is then validated, using a combination 
of automated and human checking processes, before 
being centrally stored (either in a data warehouse or 
in an Oracle database). From there, different departmental
units make use of data for modelling, analysis
and/or data visualisation aligned to different data tools, 
while also working with external/internal research 
experts. Finally, the laboriously edited data is widely
reported, both publicly and within different portals
used by schools, parents or (in the US) districts. 

In general, most of our interviewees were well aware of 
the complexities behind this technical infrastructure, 
which might look straightforward on paper, but in prac- 
tice includes various interdependencies and requires data 
to flow back and forth multiple times. In fact, most inter- 
viewees contrasted their work around data with linear 
procedures or loop circle models (as the technical infra- 
structure would suggest), instead describing it as highly 
experimental, involving significant elements of ‘messing 
around’ or, as one interviewee phrased it, ‘cooking’ with 
multiple ingredients (data, algorithms or models) to find 
working solutions within a highly diverse entanglement of 
often very different logics, stakeholders or problems.⁵

In line with this argument, interviewees reported 
that it has become increasingly difficult to organise 
and work internally with growing amounts of data, 
which also means an increasing dependence on particu- 
lar programs, algorithms or indices — with consequent 
effects due to their selectivity. As one DESE actor in 
Massachusetts reported: 


[I]t’s the program that specifies what you’re doing to
the data. It says filter these things out, count that and 
don’t count this and add these things, but don’t add 
those things. It’s literally the program, the query that 
pulls across the data and so on.⁶


Similarly, a German IfBQ actor stated: 


An index value always expresses a particular background
question, specific method-related consideration.
And in fact this partly determines how to look at
this data.⁷


Figure 1. The technical infrastructure of state school monitoring.


At the same time, in both countries data practices share 
a new kind of data economy, including attempts to 
reduce ‘unnecessary’ data, to eliminate data duplication 
or alternative data expressing the same thing or to 
automate and accelerate data collection using inter- 
operability standards, centralised platforms and data 
business rules. 

Framed by these more general narratives of both 
data expansion and data reduction, we identified differ- 
ent discrepancies about which our state agency actors 
raised concerns when describing their data work, both 
in a more narrow or technical sense (writing algorithms, 
building models for calculation, linking data) and 
also in a wider sense (embedding such technical prac- 
tices into the wider contexts of school monitoring 
and leadership) (Figure 2). We discuss the most fre- 
quently reported discrepancies, which all share 
political implications, in greater detail within the 
following sections. 


Data simplification versus data accuracy 


A central goal of state education agencies is to nudge 
schools, teachers, parents or the wider public in the dir- 
ection of using (their) data more frequently and to 
improve data-based communication. However, at the 
same time, our interviewees were well aware that many 
of their addressees lacked the time, expertise or motiv- 
ation to dedicate much time to understanding, using 
and interpreting the (rising amount and complexity
of) data in the ‘right’ way. As one DESE actor in
Massachusetts expressed: 


What we’re trying to do right now is expand our out- 
reach because we know that there’s a huge opportunity 
for parents and kids, other audiences to use this data 
but they’re not going to have as much experience with 
data, they are not going to be the ones to download it 
and put it into Tableau and run analytical reports to 
figure out which school has the best support program.
[...] [W]e are trying to [...] really work [...] on
data visualisation and doing more actionable data 
with less interpretation of the data. So that we do it 
so that parents, we can reach that audience that is not 
data experts. 


Another interviewee from IfBQ in Hamburg made a 
similar point: 


It makes much more sense to arrange data in a way that 
takes less effort to use. Instead one can instantly draw 
on it, show things. This [...] map is designed to be 
printed in any format ready to use for presentations. 


As interviewees in both countries reported, the easier 
and more ready-to-use the data (e.g. using visualisa- 
tions that tell a clear story), the better it can be ‘under- 
stood’ and the more it will also be used by non-experts. 
We found various examples of such explicitly simplified 
and condensed data instruments (designed to further 
prevent schools from ‘drowning in data’, as one 
agency actor suggested), including one-(web)page data 
summaries for each school (e.g. the school Profiles or
DART (District Analysis and Review Tools) instruments
in Massachusetts or the School at a Glance/
Schule im Überblick (SchUb) instrument in Hamburg)
as well as the expanding usage of maps, graphs and
traffic light systems.


Figure 2. Doing monitoring in state education agencies.
[Diagram not reproduced. Recoverable labels: Monitoring in state education agencies; Discrepancies; Data Expansion; Data Reduction; Contextualization; Commensuration; Data Validity; Acceleration; Data Simplification; Data Transparency; Data Security; Accountability; Global, National and Local Data Infrastructures and Flows; Cultural, Social and Institutional Settings.]

This user-friendly simplification, however, is accompanied
by a significant risk of neglecting the multiple
possible interpretations of data, intended to be viewed 
in a context-sensitive manner. In other words, a key 
issue raised by a number of interviewees was what 
they described as an unfortunate discrepancy between 
the demand for simplification and a simultaneous 
demand for data accuracy, which appeared just as rele- 
vant for data communication as for data production 
and processing. 

As an example, the person responsible for the Early 
Warning Indicator System⁸ in Massachusetts said:


Someone was looking at [the data and said] maybe I 
should use this and start to encourage kids who are at 
high risk of not going to college or not persisting at 
college to advise them away from post-secondary. We 
were like, no! That’s exactly the opposite of what this is. 


Unsurprisingly, the problem of how to prepare and 
communicate data with the ‘right’ amount of complex- 
ity increases with every further expansion of potential 
data audiences that each demand different forms of 
data preparation, processing and visualisation. For 
example, an interviewee from Massachusetts said: 


I’ve been a little worried that we have so many data 
tools, as you’ve seen [...] and we develop them in so 
many different ways and we deploy them in so many 
different ways and I’m worried [...] that we are not 
clear on what these are all for, who should use which 
ones, what’s the right audience and that kind of thing. 


Similarly, an IfBQ actor from Hamburg stated: 


[This instrument] is not really user friendly, because it 
offers something for everybody, which means it actually 
doesn’t offer the right thing for anybody. 


Consequently, what many of our respondents aspire to 
for the future is a further development of customisable, 
interactive and flexible data instruments, which offer 
various, user-related options for adaptation, thus com- 
municating the complexities of data without losing 
attractiveness. 


Who to compare? 


Closely related to the problem of how (strongly) to contextualise
data, a key component of doing monitoring
for school monitoring and leadership lies in commensuration,
which is making particular things as data
comparable to other things as data. We found such 
commensuration practices to be highly relevant for all 
dimensions of doing monitoring and for the enactment 
of data infrastructures in state education agencies, 
while the questions of who is made comparable with 
whom and which metrics inform the fabrication of 
comparisons also imply significant challenges. 

One challenge comes with what our respondents
reported as the increasing adoption of so-called
‘fair comparisons’. Unlike comparing,
for example, a school’s performance to the performance 
of neighbouring schools or a student’s performance to 
peers in his/her class, fair comparisons instead allow us 
to relate a particular performance result to the schools/ 
students across the state that are the most statistically 
similar, thus promising a better (fairer) and context- 
related understanding of data. Such de-territorialised 
forms of comparison, which have been made possible 
by data centralisation, interoperability and standardisa- 
tion, are not limited to measuring and comparing per- 
formance data, but instead have increasingly become 
part of all kinds of data tools used by state education 
agencies. However, while this growing reference to stat- 
istical ‘context’ has introduced a new (‘fairer’) dimen- 
sion of data contextuality, it has also further 
complicated the question of how much (territorial or 
statistical) context is needed to properly understand 
and use data (see last section). In other words, interviewees
suggested that it has become increasingly difficult
to determine how many and which comparison
options should be ‘offered’ to data users or directly 
built into data tools. 

In Hamburg, we found such challenges reflected 
in the fabrication of a social index (Hamburger 
Sozialindex) to classify the socio-economic status of 
schools, which was to be used not only to determine 
state-provided resources for that school but also to 
statistically calculate peers for evaluating test perform- 
ance (then named ‘fair comparison’). Interviewees 
working with this instrument reported that using such 
an adjusted feedback method can foster data accept- 
ance and can help schools to properly evaluate their 
own performance. At the same time, however, IfBQ 
actors also highlighted that school index classifications 
were developed in 2012 and six years later were out- 
dated, thus causing some schools to feel inadequately 
represented in data that still informs meaningful deci- 
sion-making. Responding to this feedback, Hamburg 
state education agencies for a while (but not anymore) 
offered an alternative on-demand evaluation to adjust 
the index data. 

In Massachusetts, an instrument reflecting the discrepancy
between territorial and statistical
contextualisation is the so-called Resource Allocation
and District Action Reports (RADAR) instrument. 
RADAR collects and models various district-level 
data with a focus on improving finances and spending. 
An important part of RADAR is the option for districts 
to make sense of their data by comparing themselves to 
up to 10 other districts, including those which are terri- 
torially far away but similar, e.g. in demographics or 
student performance. However, as one DESE actor 
reported, getting districts to use different (fairer) forms 
of data contextualisation has proved challenging: 


[...] [M]ost districts will pick at least a couple of comparison
districts from those right around them. Then
we had thought that our DART list [DART = District 
Analysis and Review Tools, a different data tool on 
school and district performance] we’ll give you our 10 
based on demographics, we thought this was great, 
because it let districts know [...] [there are schools] 
that they might not realise that were across the state 
that they had never heard about but are so similar to 
them. They hated that. They did not consider districts 
far away to be good comparisons. 


Hence, what DESE now does instead is offer different 
options for either territorial or statistical comparison, 
again using customisation and flexibilisation to let data 
users decide who they would like to compare them- 
selves to (see www.doe.mass.edu/research/radar). 

Both examples presented highlight the complexity 
underlying increasing options for commensuration 
using either territorial or statistical relation-making, 
or indeed trying to find a balance between the two. 
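
To make the distinction between territorial and statistical relation-making concrete, the following is a minimal sketch, not a description of how RADAR or the Hamburger Sozialindex actually work. It assumes a hypothetical list of districts with made-up coordinates and two made-up indicators (share of low-income students, mean test score), and ranks comparison peers either by geographic distance or by distance in standardised indicator space. All names, values and the choice of Euclidean distance are illustrative assumptions.

```python
from math import hypot

# Hypothetical districts: (name, x_km, y_km, share_low_income, mean_score).
# None of these values come from the agencies discussed in the text.
DISTRICTS = [
    ("A", 0, 0, 0.60, 480),
    ("B", 5, 2, 0.20, 540),
    ("C", 8, 1, 0.25, 530),
    ("D", 150, 90, 0.58, 485),
    ("E", 200, 40, 0.62, 478),
]


def standardise(values):
    """Convert raw values to z-scores so indicators share one scale."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    sd = var ** 0.5 or 1.0  # guard against a constant indicator
    return [(v - mean) / sd for v in values]


def peers(target, districts, mode, k=10):
    """Return up to k comparison peers for `target`.

    mode="territorial": nearest districts on the map.
    mode="statistical": nearest districts in indicator space.
    """
    names = [d[0] for d in districts]
    if mode == "territorial":
        cols = [[d[1] for d in districts], [d[2] for d in districts]]
    elif mode == "statistical":
        cols = [standardise([d[3] for d in districts]),
                standardise([d[4] for d in districts])]
    else:
        raise ValueError(f"unknown mode: {mode}")
    i = names.index(target)

    def distance(j):
        return hypot(cols[0][j] - cols[0][i], cols[1][j] - cols[1][i])

    others = [j for j in range(len(names)) if j != i]
    return [names[j] for j in sorted(others, key=distance)[:k]]


territorial = peers("A", DISTRICTS, "territorial", k=2)
statistical = peers("A", DISTRICTS, "statistical", k=2)
```

With these made-up values, the territorial mode selects the nearby districts B and C, whereas the statistical mode selects the distant but demographically similar districts E and D. Even in this toy case, the two modes produce entirely different peer groups, and the choice of indicators and distance measure is itself one of the small decisions discussed above.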


Speeding up data production while improving 
data validity 


Another key problem of doing monitoring within state 
education agencies is improving data validity while 
handling (public or political) expectations to produce 
data more rapidly, frequently or, at best, in real-time. 
The majority of our interviewees expressed concern 
about this tension and the challenges of developing solu- 
tions to deliver data more quickly or, at least, ‘fast 
enough’ (e.g. for publishing educational monitoring 
reports by the time they are needed to inform decision- 
making), while still ensuring data quality and validity. 
In both cases, state education agencies are dealing 
with this issue by setting a specific (yet not overly exten- 
sive) timeline for data validation practices, including a 
deadline which defines the moment at which data 
becomes ‘frozen’. In other words, at that point in 
time (also described as the ‘single point of truth’) data 
in the system is perceived to be correct and is further 
processed into reporting or additional data modelling,
while no further data changes are permitted. As one DESE
actor described it: 


[...] [O]nce they certify it, once every district says okay, 
I’m certifying this data and we compile it, I think of it 
as like the big steel door shuts, that’s it. [...] You can’t 
go back [...] because once you have that data and you 
start reporting it out, it’s reported out in so many 
places. So we don’t have control of all that anymore. 


A similar view was expressed by a BSB actor in 
Hamburg: 


Well, one has to accept that it’s actually the right thing 
not to correct [it] anymore because you already reported 
the data to the KMK [Standing Conference of the 
Ministers of Education of the German States] after all, 
the senator launched the numbers at the press confer- 
ence. [...] This data shouldn’t be changed because it 
would open up an already published stage of affairs. 


Given the importance of this point of literally no return 
(see exceptions below), state education agencies in both 
countries are making massive investments in improving 
data validation within the provided deadline. Such vali- 
dation includes (both automated and human) data 
checking practices within the agency,⁹ data business
ruling, but also (technical) barriers to prevent schools 
from reporting incorrect data using error reporting, 
which, however, has caused a different problem, as 
one DESE actor described: 


What’s the bigger problem is [the schools] [...] fix [their 
data] [.. .] in order to get the submission through, but they 
never go back to fix it in their system so that the next time 
they don’t get those errors, is probably the bigger problem. 


In other words, state education agencies need to create 
more interlaying instruments for downstream data vali- 
dation processes that are also able to remove errors step 
by step in the source systems. Still, however, strict time- 
lines are reported to complicate this process, as one 
DESE actor responsible for the collection and processing
of assessment data¹⁰ reflected:


So we fix these [...] [data errors] from June while we’re 
scoring the essays over the summer. So it takes four 
weeks or five weeks to score 2 million essays, 3 million 
essays we are scoring and know the response from that. 
While that’s going on [...] [the schools are] fixing the 
data. [...] In September we have official results, and it’s 
almost always perfect. 


Despite this growing investment in data validation, data 
‘mistakes’ have not vanished completely. Interestingly, 
referring to such cases, several respondents in fact questioned
the single point of truth and suggested that
‘wrong’ data should also be corrected after this point, 
at least if the data is important. For example, one actor 
from BSB drew the following distinction: 


[...] [U]nfortunately one has to say, you only have a 
certain period of validation. So the data is also looked 
at in detail afterwards. If something slips through, 
which regrettably occasionally happens and is very 
important, the data warehouse and the single point of 
truth will be changed in retrospect. 


Similarly, the assessment expert at DESE said: 


[...] [Y]ou live with the one or two errors for school
reporting but at the student level we don’t live with it.
We still fix it, even now especially for high school because
you can’t graduate unless you pass the test. [...] We will
go back. Two years later someone says: I know I passed
that test, how come I can’t graduate? Okay. So we’ll go
down and we’ll do handwriting analysis [...]. For graduation
and sometimes for scholarships that we also give
because of the tests, we have one person who is always
working on forensic examinations.


Unsurprisingly, data that has already been reported 
(publicly) is significantly less likely to be changed, 
which in some cases means ‘living with errors’ in 
order to preserve data coherence. 


Increasing both data transparency and data 
security 


A fourth discrepancy which was frequently reported by 
our interviewees in both countries was the problem of 
simultaneously increasing demands for data transpar- 
ency and data security. 

Particularly in the US, where a much stronger value 
has traditionally been placed on publicly available data, 
respondents strongly supported publishing as much 
data as possible: 


I’m a believer in publishing as much data as we possibly 
can. I think that the education community, or even just 
the public at large, has become much more fluent in 
being able to look at data and understand data. I [...] 
believe in that you can effect change by just purely 
publishing data and letting people see it. Because then 
I think people will start asking questions about it and 
that’s a form of accountability. We publish an awful lot 
of data and it’s sort of a philosophy that says, “Let’s 
put as much out there as we can and let people harvest 
or digest whatever the appropriate level that’s right for 
them.” That’s sort of the general idea. 


Still, however, actors at DESE were also well aware 
that some data is perceived to be too sensitive for pub- 
lication, either for student data security reasons or in 
order to prevent an unintended overreaction, as one 
interviewee stated: 


There are plenty of people who go overboard and they 
overreact to the data that they don’t really understand 
sometimes and they make decisions, even though it’s 
good data you can make a bad decision with good data. 


As two other DESE actors outlined, such overreactions 
are particularly likely in high stakes contexts, such 
as student performance results and their effect on 
schools’ autonomy,¹¹ parental school choice or real
estate markets: 


[...] [T]he public looks at these and you may think that 
a 1% difference from the third grade this year and read- 
ing results so next year it goes down 1% you would 
think, so what? That’s statistics. I live in a town where 
that 1%, people will get concerned about that. 

I would worry about there’s tons of ways to use this 
data inappropriately, either tracking them or shaming 
students and worry about some of that. So I want to 
make sure that folks who are using it are using it in a 
way that helps them. 


Consequently, US interviewees also reported that they
were extremely careful in deciding which data is made 
available to whom (often using data security roles for 
selective database access), particularly in the case of 
parents: 


We don’t have a parent portal, that’s always been 
talked about, there are lots of really tactical implica- 
tions around how to ensure that the right people have 
access to it. 


Given that Germany has been much more sceptical 
about publishing data (which is still more strongly 
regulated than in the US), the relation between pub- 
lished data (e.g. the aforementioned social index) and 
for example parental school choice is much less prom- 
inent, as one IfBQ actor explained: 


One could suspect that [relation]. We never really 
examined that. But these are topics and things [school 
choice decisions] that are mostly talked over privately, 
unrelated to the social index. Not even all parents know 
about that instrument. [...] Parents have a particular 
idea of school anyways [...] one could assume that 
some aspects of parental school choice are influenced 
by that. But that is something you would have to take a 
closer look at.


Nevertheless, our respondents from Hamburg also
expressed a growing pressure to make data available 
and cited schools worrying about the potential future 
usage of their data for accountability purposes: 


As soon as schools notice something is done with their 
data, you should carefully consider whether you do it 
or not. [...] Schools are quite sensitive in this regard. 


Finally, in both countries, given the growing promi- 
nence of data-based school governance, data security 
has become an increasingly political issue, resulting in 
new data protection laws and regulations. As one IfBQ 
actor said: 


Data protection has become much more important 
over the past few years, compared to 10-15 years 
ago, because there are many more possibilities and 
there is much more data available than a few years 
ago. Hence, it’s important to describe methods and 
processes very clearly, to define the conditions for 
linking data. 


In both countries, a common way of dealing with these 
ambivalent requirements has been to introduce com- 
plex systems of data pseudonymisation, which allows 
a great deal of (linked) data to be processed and pub- 
lished without identification. 


Proactively considering the ambivalent effects of 
accountability 


As mentioned in the case selection part of this paper, the 
US and Germany stand in stark contrast in terms of 
their attitudes towards using and relying on (quantified) 
data — particularly for accountability purposes — with 
Germany being much more sceptical towards published 
data, standardised testing and the use of high-stakes 
rankings. That said, while using data for accountability
was still much less of an issue in Hamburg (although
this seems to be gradually changing, see below), in
Massachusetts it was frequently mentioned as a key feature
of doing data, which simultaneously, however,
appears to intensify the aforementioned discrepancies.
DESE actors were thus well aware that using data to 
enforce accountability and build accountability models 
is strongly influenced by the norms and values used as 
underlying benchmarks: 


[...] [W]e’ve done some pretty deep philosophical dis- 
cussions when we are debating what indicators to 
include, how much improvement we should expect to 
see and it’s an interesting balance of this technical side 
and then this normative side. Because in the end, 
you’ve got to say, did this school make it or not? 


Modelling accountability (data) consequently poses 
additional challenges, as, on the one hand, the data 
and models are still intended to be used by schools 
for their own improvement. On the other hand, the 
stakes attached to accountability data enhance new 
dynamics related to the model, such as ‘blind’ perfor- 
mativity or gaming. Several DESE actors expressed 
concerns about schools manipulating data in response 
to accountability measures, with such a risk strongly 
affecting how they do data in this field: 


From the accountability perspective, it could be tricky
to put [particular] [...] kind[s] of measures in accountability.
Like, for example, we’re piloting a school climate
survey. We could down the road consider putting
that in, but you create these centres for teachers or 
principals or whoever to tell the kids, make sure you 
fill all those out, the top possible score in a way that’s 
harder to do with an assessment, like it’s harder to 
manipulate assessment graduation rates. So that’s 
where I think the conversation gets more challenging, 
is around using accountability. 


As mentioned above, despite the officially remaining 
scepticism against high-stakes accountability, even in 
Germany we found that the rising amount of available 
data and the use of that data for resource distribution, 
state-school target agreements or school support (via 
‘close consultation’ and ‘continuous data feedback’, 
as one actor described it) has in fact significantly 
increased the use of data for accountability purposes 
(which is not simply low-stakes anymore). Two state- 
ments nicely illustrate this visible discrepancy. One 
IfBQ actor said: 


Political decisions won’t be linked to that data [...]. 
Not like in the US where schools can be closed [based 
on accountability scores]. [...] We don’t think about 
this. And I think that’s good. 


Later in the interview, the same actor said: 


There are a lot of schools that, from our view, are doing 
a good job on [...] [using data]. They look at the 
results, discuss them [...], build on them for school 
development. Schools have also got their target agree- 
ments with their school supervision agency, where such 
topics are discussed as soon as they notice that they 
perform lower than comparable schools. They say 
they want to change something, want to improve 
some student groups who perform badly. Many schools 
do a great job on that. 


Finally, in Hamburg, interviewees also expressed con- 
cern about schools gaming data collections that are 
linked, in particular, to resource distribution, such as
the aforementioned social index. For example, inter- 
viewees at the IfBQ reported that some schools were 
worrying that other schools could manipulate their 
social index by reporting more students with low 
socio-economic status than were actually attending in
order to receive more funding. 


Concluding remarks 


The aim of this paper was to provide empirical insights
into how state education agencies in Germany and the
US enact the rising datafication of schooling, looking,
in particular, at data infrastructures, flows and practices
for school monitoring. Our findings have thus illustrated
some selected features of this ‘doing’ monitoring
as reported by actors from three state education agencies
in two different national contexts. Even though our
findings generally reveal what Selwyn (2013: 198) has 
described as the ‘messy’ realities of technology and edu- 
cation, we still identified different types of ‘doing data 
discrepancies’ that present somewhat typical challenges 
and ambivalences described by our interviewees in both 
countries in surprisingly similar ways. This is despite 
the fact that our respondents from Hamburg continue 
to articulate strong criticism of (public) rankings and 
high-stakes testing. Nonetheless, we found contextuality
to be strongly reflected in the doing of data, as is evident,
for example, in the case of making data publicly
available. It is important to mention once more that
both selected cases — within their country context — 
represent advanced states with a comparatively (in rela- 
tion to their state peers) long history of datafication 
and an extensive use of data for monitoring and 
school leadership. However, since we also found 
many of our findings reflected in other states (both in 
Germany and the US), we assume our conclusions to be 
of significant cross-context relevance. 

At the same time, the discrepancies reported by our 
interviewees show how the social and technical are not 
only deeply interwoven with data-based school moni- 
toring, but also — as emphasised in the existing critical 
data studies literature — how data practices always have 
political implications, particularly when applied to sys- 
tems of (high or low stakes) accountability. As Kitchin 
and Lauriault (2014: 4-5) state, data infrastructures are 
always ‘[...] expressions of knowledge/power, shaping 
what questions can be asked, how they are asked, how 
they are answered, how the answers are deployed, and 
who can ask them’ (see also Ruppert et al., 2017). In 
other words, monitoring infrastructures create what 
West (2017: 1) describes as limited ‘[...] second-hand 
representations of important objects of analysis’ that 
administrators use to speak on behalf of the school, 
the teacher or the student. As our findings illustrate, 


such representations not only create new categories 
(e.g. students ‘at risk’ or schools ‘in need’) but make 
things (in)visible or predictable in particular ways, 
ultimately having the potential to change ‘[...] the 
essential qualities of what is being studied’ (West, 
2017: 10-11). While new information thus continuously 
produces the need for more and better information 
(Thompson and Sellar, 2018), it simultaneously 
increases the amount of data management, including 
data about data production (evident for example in 
the expansion and evaluation of data validation and 
data business rules) (Piattoeva, 2016), ultimately shift- 
ing more and more attention towards ‘valid’, ‘fast’ or 
‘usable’ representation-making. 

Nonetheless, a key result of our study is that datafi- 
cation, at least in our selected state education agencies, 
does not appear to produce single centres of calculation 
and data power, but is instead mediated through mul- 
tiple infrastructures and practices that together perform 
calculation, commensuration and data work. Against 
this backdrop, we fully agree with Gray et al. (2018: 
1) that instead of (only) calling for data literacy in the 
sense of competencies in reading and working with 
datasets, there is a pressing need for so-called data 
infrastructure literacy, which is ‘[...] the ability to 
account for, intervene around and participate in the 
wider socio-technical infrastructures through which 
data is created, stored and analysed’. This, however, 
requires close empirical observation of data infrastruc- 
tures at work, not only in the field of education but also 
with regard to wider issues of governance, data-driven 
policy-making and the organisation of the state. In this 
regard, our study, and particularly our attempt to both
visualise and (at least partly) typify the data infrastructure
of doing monitoring in state education agencies,
while specific in its contribution, might also be applicable
to other policy fields.


Notes


1. We use the term ‘state’ here to describe subnational units 
of educational authority, as commonly used in the US. It 
seems important to note that such subnational authorities 
are usually named Länder in Germany. Yet, we use the
US term here for the purpose of alignment between the
cases. 

2. For a closer examination of the policy context around 
expanding datafication in both countries see Hartong 
(2018b). 

3. Even in the US, where districts have traditionally been 
charged with monitoring schools through data, state edu- 
cation agencies have increasingly become the main source 
for educational data because they not only collect data 
from every student and teacher in every school and every 
district of the state but can also make comparisons using 
these data. 

4. The interviews we draw on for this contribution are part 
of a larger project, funded by the German Research 
Foundation (DFG, project number HA 7367/2-1), which 
seeks to improve the understanding of digital-era govern- 
ance and the role of data management in education 
within the policy contexts of Germany and the US. The 
project includes analyses of (1) policy material, such as 
monitoring regulations, resolutions or digitalisation/
datafication programmes, (2) the actors and institutions 
involved in performance data management at national 
and state level, using policy network analysis, (3) the 
performance data infrastructures and their modes of 
operation in selected state education agencies in each 
country, as well as (4) interviews with national and 
state-level policy actors, technicians, administrators and 
data system providers. 

5. At the same time, several interviewees clearly pointed out
that there is not, and never will be, ‘perfect data’ mirroring
‘the truth’. As one interviewee stated: ‘Any dataset that 
you think is perfect, you just haven’t looked at it close 
enough’. 

6. To aid legibility and understanding, interview quotes were 
slightly altered both linguistically and grammatically. 

7. German interview transcripts were translated into
English by the authors.

8. For an overview of these kinds of data tools, see Neild 
et al. (2007). 

9. Interviewees reported that this checking process is usually 
directed at finding nonsense data (e.g. a seven-year-old
student attending 12th grade), data doubles, data incon- 
sistencies or missing data. 

10. The assessments are increasingly carried out online, so 
much more data is collected automatically. However, 
free text parts are still coded manually. 

11. As an example, schools performing poorly in account- 
ability measures can be taken over at least temporarily 
by DESE, see also below. 